Technical Report
Visual  Navigation  &  Space  Perception 
M.S.  Banks 

During  the  last  two  and  a  half  years,  we  worked  on 
three  general  problems:  Surface  perception,  heading 
perception,  and  visual-haptic  integration.  In  this 
progress  report,  we  review  the  work  leading  up  to  our 
current  work  (thus  some  of  the  material  appeared  on 
last  year's  progress  report)  and  then  we  discuss  the 
work  completed  this  past  year. 

1.  Surface  Perception 

The  problem  of  visual  space  perception  is  the 
recovery  of  the  location,  shape,  size,  and  orientation  of 
objects  in  the  environment  from  the  pattern  of  light 
reaching  the  eyes.  The  visual  system  uses  disparities 
between  the  two  retinal  images  to  glean  information 
about  the  3-D  layout  of  the  environment.  In  the  last 
seven  years,  we  have  investigated  how  disparity  is 
used  to  recover  surface  orientation.  Most  of  the  work 
has  concerned  determining  the  slant  of  an  isolated 
surface  rotated  about  a  vertical  axis.  This  problem  is 
interesting  because  the  pattern  of  disparities  depends 
not  only  on  slant,  but  also  on  location  relative  to  the 
head  (Ogle,  1950). 

The first part of this section is essentially the same as
last year's progress report because we need to explain
the background to the work we accomplished during
the two-and-a-half-year grant period. If you have
already read this background material in previous
progress reports, you can skip ahead to page 5.

Figure  1.1  depicts  the  geometry  for  binocular 
viewing  of  a  vertical  plane.  The  objective  gaze-normal 
surface  is  the  plane  perpendicular  to  the  cyclopean  line 
of  sight.  The  slant  S  is  the  angle  by  which  the  plane  of 
interest  is  rotated  about  a  vertical  axis  from  the  gaze- 
normal  surface. 


Figure  1.1.  Binocular  viewing  geometry.  See  text. 


What  signals  are  available  for  slant  estimation?  One 
important  signal  is  horizontal  disparity.  For  a  smooth 
surface  slanted  about  a  vertical  axis,  the  horizontal 
disparity  pattern  can  be  represented  locally  by  the 
horizontal size ratio (HSR; Figure 1.1), the ratio of
horizontal  angles  the  patch  subtends  in  the  left  and 
right  eyes  (Rogers  &  Bradshaw,  1993).  Changes  in  HSR 
produce  obvious  and  immediate  changes  in  perceived 
slant,  so  this  signal  must  be  involved  in  slant 
estimation. However, HSR by itself is ambiguous. To
illustrate the ambiguity, Figure 1.2 shows several
surface patches that give rise to HSRs of 1 and 1.04. For
each  HSR  value,  there  is  an  infinitude  of  possible  slants 
depending on the surface's location. Clearly, the
measurement of HSR alone does not allow an
unambiguous estimate of the surface's orientation, nor
does any other description of horizontal disparity
(Longuet-Higgins, 1982). A main purpose of our work
has  been  to  determine  what  other  signals  are  used,  in 
combination  with  horizontal  disparity,  by  the  visual 
system  and  to  determine  how  those  signals  are 
combined  to  determine  surface  slant. 


Figure  1.2.  Ambiguity  of  HSR.  Plan  view  with  the  abscissa 
representing  lateral  position  and  the  ordinate  forward 
position.  The  line  segments  represent  surface  patches  that 
give  rise  to  HSR  =  1  (upper  panel)  and  HSR  =  1.04  (lower 
panel). 

Another potentially useful signal is vertical disparity,
which can be represented by the vertical size ratio
(VSR; Figure 1.1), the ratio of vertical angles subtended
by a surface patch in the left and right eyes. VSR varies
with the location of a surface patch relative to the head,
but does not vary with surface slant (Gillam &
Lawergren, 1983). The circles in Figure 1.3 show the
VSR at various locations in the visual plane. Another
signal is the rate of change in VSR with azimuth
($\partial VSR / \partial \gamma$); this signal depends strongly on distance
and less so on slant.

Other  useful  signals  are  provided  by  sensed  eye 
position.  Ignoring  torsion,  each  eye  has  one  degree  of 
freedom  in  the  visual  plane.  We  can  thus  represent 
binocular eye position by two values, $\gamma$ and $\mu$, the
version  and  vergence  of  the  eyes  (Figure  1.1). 


Figure  1.3.  IsoVSR  contours.  Plan  view.  Abscissa  represents 
lateral  position  and  ordinate  forward  position.  Each  contour 
represents  the  regions  in  space  for  which  VSR  is  constant; 
each  contour  represents  a  different  VSR. 

Finally,  useful  slant  information  can  be  gleaned 
from  nonstereoscopic  signals  such  as  the  texture 
gradient  created  by  projection  onto  the  retinae  of 
surfaces  with  statistically  regular  textures  (Cutting  & 
Millard, 1984; Buckley & Frisby, 1993; Cumming et al,
1993). Such cues were present in older stereoscopic
work using real objects (e.g., Ogle, 1938; Gillam et al,
1988).  In  more  recent  work  with  computer  displays, 
there  is  still  generally  a  perspective  cue  that  indicates 
that  the  surface  is  frontoparallel  to  the  head  (e.g., 
Rogers  &  Bradshaw,  1995;  Howard  &  Kaneko,  1994). 
Neither  the  slant  specified  by  a  given  texture  gradient 
nor  the  uncertainty  of  the  estimation  varies  with 
distance or azimuth (Sedgwick, 1986; Backus et al,
1999). 

An  unambiguous  estimate  of  slant  can  be  obtained 
from  various  combinations  of  the  above-mentioned 
signals.  For  example,  slant  can  in  principle  be  estimated 
from  HSR  and  sensed  eye  position  (Ogle,  1950;  Foley, 
1980).  From  Backus  et  al  (1999): 

$$S = -\tan^{-1}\left(\frac{1}{\hat{\mu}} \ln HSR - \tan\hat{\gamma}\right) \tag{1.1}$$

The estimates of $\mu$ and $\gamma$ ($\hat{\mu}$ and $\hat{\gamma}$) are presumably
derived from extra-retinal, eye-position signals.
Correcting HSR via eye position has the important
consequence of compensating for the changes in
binocular viewing geometry that occur with changes in
distance and azimuth (Kaneko & Howard, 1996; Ogle,
1950).

Slant can also be estimated from retinal-image
information alone (Garding et al, 1995; Gillam &
Lawergren, 1983; Koenderink & van Doorn, 1976;
Mayhew & Longuet-Higgins, 1982). From Backus et al
(1999):

$$S \approx -\tan^{-1}\left(\frac{1}{\mu} \ln \frac{HSR}{VSR}\right) \tag{1.2}$$

where $\mu$ can be measured from retinal image properties
alone. In the terminology of Garding et al (1995), $\mu$
"normalizes" the slant (scales HSR for changes due to
viewing distance) and VSR "corrects" the slant
(corrects HSR for changes due to azimuth).
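
As a rough illustration of the two stereoscopic estimators, the
following Python sketch evaluates Equations (1.1) and (1.2) for a
few signal values. It assumes the forms S = -atan((1/mu) ln HSR -
tan gamma) and S = -atan((1/mu) ln(HSR/VSR)), with angles in
radians. The function names and the example vergence and version
values are hypothetical and are not taken from Backus et al (1999).

```python
import math

def slant_hsr_ep(hsr, mu, gamma):
    """Slant (rad) from HSR plus sensed vergence mu and version gamma (rad); Eq. 1.1 form."""
    return -math.atan(math.log(hsr) / mu - math.tan(gamma))

def slant_hsr_vsr(hsr, vsr, mu):
    """Slant (rad) from HSR and VSR, with vergence mu in radians; Eq. 1.2 form."""
    return -math.atan(math.log(hsr / vsr) / mu)

# Hypothetical viewing situation: vergence 0.1 rad, version 10 deg.
mu = 0.1
gamma = math.radians(10.0)

# The same HSR maps onto different slants at different versions,
# which is the ambiguity of HSR taken by itself.
for g_deg in (-10.0, 0.0, 10.0):
    s = slant_hsr_ep(1.04, mu, math.radians(g_deg))
    print(f"HSR=1.04, version={g_deg:+.0f} deg -> slant {math.degrees(s):+.1f} deg")

# Equating the two formulas shows that they agree when ln(VSR) = mu * tan(gamma).
vsr = math.exp(mu * math.tan(gamma))
s_ep = slant_hsr_ep(1.04, mu, gamma)
s_vsr = slant_hsr_vsr(1.04, vsr, mu)
print(math.degrees(s_ep), math.degrees(s_vsr))
```

The last lines only restate algebra: setting the two expressions
equal gives ln VSR = mu tan gamma, so a VSR chosen that way makes
the eye-position and retinal-image estimates coincide.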

In summary, certain subsets of signals allow
unambiguous estimation of slant, and we can
summarize them with three calculations (Banks &
Backus, 1998a): (1) slant estimation from HSR and eye
position ($\hat{S}_{HSR,EP}$), (2) slant estimation from HSR and
VSR ($\hat{S}_{HSR,VSR}$), and (3) slant estimation from
nonstereoscopic cues such as perspective ($\hat{S}_{P}$).

In  natural  viewing,  the  slant  estimates  derived  from 
these  three  methods  should  on  average  agree. 
However,  each  signal  measurement  is  subject  to  error, 
so  even  in  natural  viewing,  the  estimates  will  differ. 
Because  a  surface  can  only  have  one  slant  at  a  time,  the 
visual  system  must  derive  one  estimate  from  the  set  of 
somewhat  discrepant  signals.  In  our  conceptualization, 
the  weight  associated  with  each  slant  estimate  is  a 
function  of  its  estimated  reliability,  and  the  estimated 
reliability  is  based  in  turn  on  the  quality  of  the 
information  present  in  the  signals  (e.g.,  Landy  et  al., 
1995;  Heller  &  Trahiotis,  1996).  Several  factors  influence 
signal  reliability.  For  example,  consider  the  effects  of 
increasing  viewing  distance.  As  distance  increases, 
there  is  no  effect  on  the  information  carried  by  the 
perspective  signal  (assuming  broadband  texture; 
Sedgwick,  1986),  but  the  information  carried  by  HSR  is 
reduced  because  a  given  set  of  slants  maps  onto  ever 
smaller  ranges  of  HSR.  Consequently,  nonstereoscopic 
slant  estimates  should  be  weighted  more  heavily 
relative  to  stereoscopic  slant  estimates  as  viewing 
distance  increases;  experimental  evidence  confirms  this 
expectation  (Buckley  &  Frisby,  1993;  Backus  &  Banks, 
1999). 
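
One common way to formalize reliability-based weighting is to
make each weight proportional to the inverse variance of its
estimate. The sketch below uses that rule purely as an assumption
for illustration; the exact weighting rule is not stated here, and
the function name and the numerical values are hypothetical.

```python
def combine_slant_estimates(estimates, variances):
    """Weighted average of slant estimates with weights proportional to 1/variance."""
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    combined = sum(w * s for w, s in zip(weights, estimates))
    return combined, weights

# Hypothetical example: a stereoscopic estimate of 10 deg (variance 1)
# and a perspective estimate of 20 deg (variance 3).
combined, weights = combine_slant_estimates([10.0, 20.0], [1.0, 3.0])
print(combined, weights)

# Raising the variance of the stereoscopic estimate, as might happen at a
# larger viewing distance, shifts weight toward the perspective estimate.
combined_far, weights_far = combine_slant_estimates([10.0, 20.0], [9.0, 3.0])
print(combined_far, weights_far)
```

Under this assumed rule the weights always sum to 1, and the less
variable estimate dominates the combined slant.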

Some  of  our  experiments  examined  whether  the 
signals  described  above  are  used  in  estimating  slant, 
and  how  the  weights  assigned  to  the  estimates  vary 
across  viewing  conditions  and  stimulus  properties. 

To  do  these  experiments,  we  built  a  haploscope  that 
allows  independent  manipulation  of  eye  position  and 
disparity.  We  examined  the  use  of  the  two  stereoscopic 
means  of  slant  estimation  described  above.  (We  made 
nonstereo, perspective information uninformative by
using a "back projection" procedure; Banks & Backus,
1998a.) Observers rotated a stereoscopic random-dot
plane about a vertical axis until it appeared normal to
the line of sight: that is, they adjusted its slant until it
was apparently gaze normal. Real and simulated
versions were varied from 15° to the left of head-centric
straight ahead to 15° to the right. Real version was
varied  by  turning  the  haploscope  arms  so  that  the 
observer  rotated  the  eyes  to  the  desired  version 
position.  Simulated  version  was  varied  by  altering  the 
disparity  field.  Thus,  an  observer  might  look  at  a  stereo 
plane  with  eyes  rotated  leftward  while  the  disparities 
presented were as if the eyes were rotated rightward. If
the visual system relies on extra-retinal, eye-position
signals ($\hat{S}_{HSR,EP}$; slant estimation by HSR and eye
position) in estimating the slant of a stereoscopic
surface, then the observers' settings would be predicted
from their actual eye positions; these predictions are
represented by the diagonal line in the left panel of
Figure 1.4. If, on the other hand, the system uses the
information contained in the disparity field alone
($\hat{S}_{HSR,VSR}$; estimation by HSR and VSR), the settings
would be predicted by the simulated eye positions;
these predictions are represented by the three
horizontal lines (one for each of three simulated
eccentricities) in the left panel of Figure 1.4.


Figure  1.4.  Predictions  and  results.  Backus  et  al  (1999). 
Natural  log  of  HSR  settings  is  plotted  as  a  function  of 
version.  Left  panel:  Predictions.  Slant  estimation  by  HSR  and 
eye  position  predicts  the  diagonal  line.  Estimation  by  HSR 
and  VSR  predicts  the  three  horizontal  lines  (one  for  each 
VSR).  Right  panel:  Results  from  one  of  3  observers.  Squares, 
circles,  and  squares  represent  results  with  different  VSR 
values. 


The results are displayed in the right panel of Figure
1.4. The data agree quite well with the predictions
of $\hat{S}_{HSR,VSR}$: the actual version of the eyes had no clear
effect on slant settings, which is counter to the
predictions of $\hat{S}_{HSR,EP}$. Thus, with large targets,
compensation for eccentric viewing is based primarily
on the pattern of horizontal and vertical disparities
within the images and little on actual eye position. We
can summarize these findings by expressing the slant
estimates as weighted averages of the signals presented
to the visual system:

$$\hat{S} = w_{H,V}\,\hat{S}_{HSR,VSR} + w_{H,E}\,\hat{S}_{HSR,EP}$$

where the $w$'s represent the associated weights. We can ignore
the nonstereo slant estimator in this experiment (not
expressed in the equation) because it always specified a
slant of 0 and thereby could have no influence in a
slant-nulling task. The data in Figure 1.4 can be fit well
by this model if $w_{H,V} = .85$ and $w_{H,E} = .15$.

The magnitudes of vertical disparities at a given
azimuth are roughly proportional to elevation above
the visual plane (VSR is, however, constant in the Fick
coordinates we use for our equations). Thus, surfaces
that subtend a small vertical angle do not create large
vertical disparities. We took advantage of this by
reducing stimulus height.

The results for one observer are shown in Figure 1.5.
Stimulus width was always 40°, but the height varied
from 0-30° (left to right in the figure). When the height
was 30°, we again found that slant settings were
determined almost exclusively by $\hat{S}_{HSR,VSR}$. However, as
stimulus height was reduced, the slant settings became
more and more consistent with $\hat{S}_{HSR,EP}$. Finally, with a
stimulus height of 0° (horizontal row of dots), slant
settings were predicted entirely by $\hat{S}_{HSR,EP}$; thus, as the
eyes turned, different patterns of disparity were
required for a gaze-normal percept.

These results show clearly that the human visual
system employs two means of estimating slant of
stereoscopically defined surfaces. The weight given
$\hat{S}_{HSR,VSR}$ is high when the stimulus is large and contains
measurable vertical disparities. The weight given
$\hat{S}_{HSR,EP}$ is high when the stimulus is short and does not
contain readily measurable vertical disparities.


Figure  1.5.  Slant  settings  for  different  stimulus  heights. 
Natural  log  of  HSR  is  plotted  in  each  panel  as  a  function  of 
version.  Panels  from  left  to  right  show  data  when  stimulus 
height varied from 0-35°. Predictions (see Figure 1.4) are also
shown  for  two  means  of  slant  estimation. 
The work described above focused on estimation of
surface slant about a vertical axis. Naturally, the visual
system must estimate slant about any axis, not just the
vertical. One can show that the slant and tilt of a
smooth surface can be recovered locally from estimates
of the slant component about the vertical axis (tilt = 0°)
and the component about the horizontal axis (tilt = 90°)
(Backus et al, 1999). Thus, we investigated slant
estimation about the horizontal axis and applied what
we learned to estimation about arbitrary axes. We
completed a paper on this topic during the grant period
(Banks, Hooge, & Backus, 2001), so we describe that
work here.


Figure  1.6.  Binocular  viewing  geometry  for  estimating 
surface  orientation.  Left  panel:  Definitions  of  slant  and  tilt. 
A  binocular  observer  is  viewing  a  slanted  plane.  The 
Cyclopean  line  of  sight  is  represented  by  the  line  segment 
between  the  midpoint  between  the  eyes  and  the  fixation 
point,  which  is  the  center  of  the  slanted  plane.  The  large 
green  plane  is  perpendicular  to  the  cyclopean  line  of  sight 
and  represents  the  gaze-normal  plane  (for  which  slant  =  0). 
The  gray  stimulus  plane  is  rotated  with  respect  to  the  gaze- 
normal  plane.  Slant  is  the  angle  between  its  surface  normal 
and  the  cyclopean  line  of  sight.  Tilt  is  the  angle  between  the 
horizontal  meridian  and  the  projection  of  the  surface 
normal.  Slant  axis  is  the  intersection  of  the  gaze-normal 
plane  and  the  stimulus  plane  and  corresponds  to  the  axis 
about  which  the  stimulus  plane  is  rotated  relative  to  the 
normal  plane.  Right  panel:  Slant  about  a  horizontal  slant 
axis;  tilt  =  90  deg.  The  eyes  are  fixating  the  middle  of  the 
stimulus plane. The eyes' vergence ($\mu$) is the angle between
the  lines  of  sight. 

The  horizontal  disparity  pattern  associated  with 
slant  about  a  horizontal  axis  (right  panel  of  Figure  1.6) 
can  be  represented  locally  as  a  horizontal-shear 
disparity.  Ogle  and  Ellerbrock  (1946)  defined  this 
disparity  as  follows.  A  line  through  the  fixation  point 
and  perpendicular  to  the  visual  plane  is  a  vertical  line. 
There  is  a  horizontal  axis  through  the  fixation  point,  in 
the  visual  plane,  and  parallel  to  the  interocular  axis. 
We  rotate  the  vertical  line  about  this  axis  and  project 
the  images  of  the  line  onto  the  two  eyes.  The 
horizontal-shear disparity ($H_R$) is the angle between
the projections of the line in the two eyes. If the eyes
are torsionally aligned (i.e., the horizontal meridians
of the eyes are co-planar) and fixating in the head's
median plane, slant about a horizontal axis is given
by:

$$S = -\tan^{-1}\left[\frac{\tan(H_R/2)}{\sin\left(\tan^{-1}\left(\frac{i}{2d}\right)\right)}\right] \tag{1.3}$$

where $S$ is the slant, $i$ is the interocular distance, and $d$
is the distance to the vertical line's midpoint. When
the distance to the surface is much greater than the
interocular distance, slant is given to close
approximation by:

$$S \approx -\tan^{-1}\left(\frac{1}{\mu}\tan H_R\right) \tag{1.4}$$

where $\mu$ is the eyes' horizontal vergence (right panel,
Figure 1.6). Thus, estimating slant about a horizontal
axis is straightforward when the eyes are aligned: the
visual system must only measure the pattern of
horizontal disparity (quantified by $H_R$) and the
vergence distance ($\mu$, which could also be measured
by use of the pattern of vertical disparities; Rogers &
Bradshaw, 1995; Backus et al., 1999).
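
To give a feel for how closely the approximation tracks the full
expression, the sketch below evaluates both, assuming that Equation
(1.3) has the form S = -atan[tan(H_R/2) / sin(atan(i/2d))] and that
Equation (1.4) has the form S = -atan[(1/mu) tan H_R], with mu =
2 atan(i/2d). The interocular distance of 6.4 cm, the function
names, and the chosen distances are assumptions made for the
illustration.

```python
import math

def slant_exact_deg(h_r_deg, distance_cm, interocular_cm=6.4):
    """Slant (deg) from retinal horizontal-shear disparity, assumed Eq. 1.3 form."""
    half_vergence = math.atan(interocular_cm / (2.0 * distance_cm))
    ratio = math.tan(math.radians(h_r_deg) / 2.0) / math.sin(half_vergence)
    return -math.degrees(math.atan(ratio))

def slant_approx_deg(h_r_deg, distance_cm, interocular_cm=6.4):
    """Slant (deg) from the small-angle approximation, assumed Eq. 1.4 form."""
    mu = 2.0 * math.atan(interocular_cm / (2.0 * distance_cm))
    return -math.degrees(math.atan(math.tan(math.radians(h_r_deg)) / mu))

# Compare the two forms for a retinal shear of -2 deg at several distances.
for d in (50.0, 100.0, 200.0):
    exact = slant_exact_deg(-2.0, d)
    approx = slant_approx_deg(-2.0, d)
    print(f"d={d:5.0f} cm  exact={exact:6.2f} deg  approx={approx:6.2f} deg")
```

With these assumed values the two forms differ by a small fraction
of a degree, and the same shear disparity corresponds to a larger
slant at a greater distance.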

The eyes, however, are not torsionally aligned in
all viewing situations. Specifically, the eyes can rotate
about the lines of sight; cyclovergence refers to
rotations in opposite directions in the two eyes. Let $\tau$
represent cyclovergence in Helmholtz coordinates.
Intorsion ($\tau < 0$; tops of the eyeballs rotated toward
one another) occurs with downward gaze at a near
target and extorsion ($\tau > 0$) with upward gaze
(Somani, DeSouza, Tweed, & Vilis, 1998). Figure 1.7
illustrates how the resulting torsional misalignment
alters the horizontal disparities at the retinas. In each
panel, there is a horizontal-shear disparity created by
the stimulus. We will refer to this as $H_S$, a head-centric
value, in order to distinguish it from the retinal shear
disparity $H_R$. In the upper row, the eyes are torsionally
aligned ($\tau = 0$) and are fixating a frontoparallel plane.
$H_S$ is 0 near the fixation point. Slant can be recovered
from equations (1.3) and (1.4). In the middle row, the eyes
are again torsionally aligned, but the plane is now
slanted about a horizontal axis ($S < 0$; $H_S > 0$; $\tau = 0$);
again slant can be recovered accurately from
equations (1.3) and (1.4). In the lower row, the plane is
slanted by the same amount as in the middle row, but
the eyes are extorted. The shear disparity at the retinas
is $H_R = H_S - \tau$. Thus, a particular combination of slant
and extorsion creates a pattern of horizontal-shear
disparity identical to the pattern created by a
frontoparallel plane when the eyes are aligned (upper
row). If we do not know the torsional state of the eyes,
the slant specified by $H_R$ is ambiguous (Ogle &
Ellerbrock, 1946; Howard & Kaneko, 1994).


Figure  1.7.  Cyclovergence  affects  the  relationship  between 
slant  and  horizontal-shear  disparity.  In  each  of  the  three 
panels,  the  left  side  depicts  the  viewing  situation  and  the 
right  side  the  shear  disparities  at  the  retinas.  Upper  panel: 
The  observer  is  viewing  a  frontoparallel  plane  with  the  eyes 
torsionally aligned ($\tau = 0$). The horizontal-shear disparity is
0.  (Note  that  we  have  not  shown  the  gradients  of  vertical 
disparity  that  would  occur  with  the  viewing  of  objects  at 
non-infinite  distances.)  Middle  panel:  The  plane  is  slanted 
about  a  horizontal  axis  (slant  <  0)  which  creates  a  positive 
horizontal-shear  disparity.  Horizontal-shear  disparity  is  the 
difference  between  the  orientations  of  the  images  of  a 
vertical  (right  eye  minus  left  eye):  -  /2  -  /2  =  - 

Lower  panel:  The  plane  is  again  slanted  about  a  horizontal 
axis, but the eyes are also extorted ($\tau > 0$) such that the
horizontal-shear  disparity  is  0.  If  the  visual  system  did  not 
compensate  for  the  horizontal  shear  created  by 
cyclovergence,  slant  would  be  misestimated. 

The need to compensate for changes in the eyes'
horizontal vergence and cyclovergence is further
illustrated in Figure 1.8. Each panel shows the slant
estimate obtained from equation (1.3) as a function of
distance (which can be estimated from $\mu$). The
horizontal-shear disparity observed at the retinas ($H_R$)
is 0, -1, and -2 deg in the upper, middle, and lower
panels, respectively. Each panel shows five curves that
correspond to the estimate from equation (1.3) for
cyclovergences of -4, -2, 0, 2, and 4 deg. The correct
surface slant is indicated by the thick curve in each
panel ($\tau = 0$). Estimates obtained from equation (1.4)
are indicated by the open circles. Clearly, failure to
compensate for cyclovergence can have a profound
effect on the estimated slant; for example, at a distance
of 100 cm, the estimation error is -47.5, -28.6, 0, 28.6,
and 47.5 deg for cyclovergences of 4, 2, 0, -2, and -4
deg, respectively. Likewise, failure to compensate for
changes in horizontal vergence (a correlate of distance)
can have a large effect on the slant estimate; for
example, when $H_R = -2$ deg (lower panel) and the eyes
are torsionally aligned ($\tau = 0$), the correct slant varies
from ~0 deg at very near distances to 47.5 deg at 200
cm. Here we ask whether the visual system
compensates for changes in cyclovergence and
horizontal vergence and, if it does compensate, the
means by which the compensation is accomplished.
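
The magnitudes quoted above can be approximately reproduced with a
short calculation. The sketch below assumes the approximate form
S = -atan[(1/mu) tan(shear)] with mu = 2 atan(i/2d), and an
interocular distance of 6.4 cm; that value is an assumption chosen
because it matches the quoted numbers, and it is not stated in the
text. The function names and the sign convention of the difference
are likewise illustrative.

```python
import math

INTEROCULAR_CM = 6.4  # assumed value, not stated in the report

def slant_from_shear_deg(shear_deg, distance_cm):
    """Slant (deg) from a shear disparity, using S = -atan(tan(shear) / mu)."""
    mu = 2.0 * math.atan(INTEROCULAR_CM / (2.0 * distance_cm))
    return -math.degrees(math.atan(math.tan(math.radians(shear_deg)) / mu))

def uncompensated_difference_deg(h_r_deg, tau_deg, distance_cm):
    """Slant computed from H_R + tau minus slant computed from H_R alone."""
    return (slant_from_shear_deg(h_r_deg + tau_deg, distance_cm)
            - slant_from_shear_deg(h_r_deg, distance_cm))

# Effect of ignoring cyclovergence at 100 cm when the retinal shear is 0.
for tau in (4, 2, 0, -2, -4):
    diff = uncompensated_difference_deg(0.0, tau, 100.0)
    print(f"cyclovergence {tau:+d} deg -> difference {diff:+.1f} deg")

# Effect of distance when the retinal shear is -2 deg and the eyes are aligned.
for d in (10.0, 50.0, 100.0, 200.0):
    print(f"d={d:5.0f} cm -> slant {slant_from_shear_deg(-2.0, d):.1f} deg")
```

Under these assumptions, a few degrees of uncompensated
cyclovergence changes the computed slant by tens of degrees at
100 cm, and a fixed shear of -2 deg corresponds to a slant that
grows steadily with distance.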




Figure 1.8. Slant estimates as a function of distance, slant,
and cyclovergence. Each panel plots the slant estimate as a
function of distance for a given horizontal-shear disparity
($H_R$). The upper, middle, and lower panels show the
estimates when $H_R$ = 0, -1, and -2 deg, respectively. The true
slant in each panel is indicated by the black curve. The five
curves in each panel represent the estimates when the
cyclovergence ($\tau$) is -4, -2, 0, 2, and 4 deg. The slant
estimates derived from Equation (1.3) are indicated by the
thin colored curves and the estimates derived from
Equation (1.4) by the small circles. Equation (1.4) provides an
excellent approximation to Equation (1.3). It is important to
note the large errors in slant estimation that would occur if
there were no compensation for the effects of cyclovergence.
The visual system could in principle compensate
for cyclovergence and horizontal vergence by use of
extra-retinal signals. In particular,

$$S \approx -\tan^{-1}\left[\frac{1}{\mu}\tan(H_R + \hat{\tau})\right] \tag{1.5}$$

where $\hat{\tau}$ is an extra-retinal, cyclovergence signal and
$\mu$ is the horizontal vergence and could be measured by
an extra-retinal vergence signal. If the extra-retinal,
cyclovergence signal is accurate, $\hat{\tau} = \tau$. To our
knowledge, there is no evidence that an extra-retinal
torsion signal exists (see Nakayama & Balliet, 1977),
but the possibility should be entertained because it has
been shown that extra-retinal signals of horizontal
version and horizontal vergence are used in
interpreting horizontal disparity patterns (e.g.,
Backus et al, 1999; Rogers & Bradshaw, 1995).

The  visual  system  could  also  compensate  for 
cyclovergence  by  use  of  vertical-shear  disparity. 
Cyclovergence  and  slant  about  a  horizontal  axis 
produce  different  effects  on  the  retinal  images; 
specifically,  cyclovergence  alters  the  pattern  of  vertical 
disparities  at  the  horizontal  meridians  of  the  eyes,  but 
horizontal-axis  slant  changes  do  not  (Rogers,  1992; 
Howard,  Ohmi,  &  Sun,  1993;  Howard  &  Kaneko, 
1994).  This  is  illustrated  in  the  middle  and  lower 
panels  of  Figure  1.7. 

Vertical-shear disparity ($V_R$) can be defined as the
angle between the projections of a horizontal line in
the two eyes (lower panel, Figure 1.7). Slant about the
horizontal axis is given to close approximation by:

$$S \approx -\tan^{-1}\left[\frac{1}{\mu}\tan(H_R - V_R)\right] \tag{1.6}$$

So the visual system could, in principle, estimate
slant even when the eyes are torsionally misaligned by
measuring $H_R$, $V_R$, and distance. This equation predicts
that changes in perceived slant can be induced by
altering $H_R$ or $V_R$, and such an effect has been
demonstrated by Ogle and Ellerbrock (1946), Howard
and Kaneko (1994), and others.
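
The logic of compensation by vertical-shear disparity can be
sketched numerically. The example assumes that Equation (1.6) has
the form S = -atan[(1/mu) tan(H_R - V_R)], that the retinal
horizontal shear is H_R = H_S - tau as stated earlier, and that
cyclovergence shifts the retinal vertical shear by the same amount
(V_R = V_S - tau). The last relation, the function names, and the
numerical values are assumptions made for the illustration.

```python
import math

def retinal_shears(h_s, v_s, tau):
    """Assumed mapping from head-centric to retinal shear (rad): both shift by -tau."""
    return h_s - tau, v_s - tau

def slant_vertical_disparity(h_r, v_r, mu):
    """Slant (rad) from the assumed Eq. 1.6 form: S = -atan(tan(H_R - V_R) / mu)."""
    return -math.atan(math.tan(h_r - v_r) / mu)

mu = 0.064                 # hypothetical vergence in radians (roughly 100 cm)
h_s = math.radians(-1.0)   # head-centric horizontal shear of the stimulus
v_s = 0.0                  # no vertical shear added to the stimulus

estimates = []
for tau_deg in (-4, -2, 0, 2, 4):
    h_r, v_r = retinal_shears(h_s, v_s, math.radians(tau_deg))
    estimates.append(slant_vertical_disparity(h_r, v_r, mu))

print([round(math.degrees(s), 2) for s in estimates])
```

Because cyclovergence is assumed to shift both retinal shears
equally, their difference, and therefore the slant estimate, does
not change with the torsional state of the eyes.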

There  is,  of  course,  a  variety  of  monocular  slant 
signals  that  can  be  used  to  estimate  slant  about  a 
horizontal  axis.  The  most  obvious  such  signal  is  the 
texture  gradient  which  can  be  used  to  estimate  surface 
slant  and  tilt  (Gibson,  1950;  Knill,  1998).  The  utility  of 
the  texture  gradient  is  unaffected  by  cyclovergence 
and  horizontal  vergence,  so  the  visual  system  would 
not  have  to  compensate  for  vergence  changes  when 
using  this  slant  signal  to  estimate  local  surface 
orientation.  We  were  able  to  eliminate  the  influence  of 
these  signals  in  the  work  presented  here,  so  we  focus 
only  on  disparity  and  extra-retinal  signals. 


There  is  clear  experimental  evidence  that  the 
visual  system  can  use  both  extra-retinal  signals  and 
patterns  of  vertical  disparity  to  compensate  for 
changes  in  horizontal  vergence.  Thus,  we  will  focus 
here  on  cyclovergence.  There  are  three  possible  means 
of  compensation. 

1.  Perhaps compensation does not occur, so
cyclovergence changes lead to errors in slant
estimation like those shown in Figure 1.8. We
will refer to this as the noncompensation model.
It is represented quantitatively by equations
(1.3) and (1.4).

2.  Perhaps compensation occurs via use of an
extra-retinal torsion signal. We will refer to
this as the extra-retinal-compensation model. It is
represented quantitatively by equation (1.5).

3.  Perhaps compensation occurs via use of
vertical-shear disparity. We will refer to this
as the vertical-disparity-compensation model. It is
represented quantitatively by equation (1.6).

Usually,  greater  slant  is  perceived  in  stereo-defined 
surfaces  when  slant  is  about  the  horizontal  axis  as 
opposed  to  the  vertical  axis  (Rogers  &  Graham,  1983; 
Mitchison  &  McKee,  1990;  Gillam  &  Ryan,  1992; 
Buckley  &  Frisby,  1993).  Because  the  signals  involved 
are  so  different  for  slant  estimation  about  the  horizontal 
axis  than  for  estimation  about  the  vertical  axis,  there  is 
a  variety  of  possible  explanations  for  this  so-called  slant 
anisotropy.  Comparing  the  results  from  the  horizontal 
axis experiments with our previous work (e.g., Backus
et  al,  1999)  will  help  delineate  the  critical  differences. 

In  the  experiments  we  varied  cyclovergence  and 
vertical-shear  disparity  independently  to  determine 
whether  the  two  estimation  methods  exist  and,  if  so, 
how  their  outputs  are  combined.  The  experimental 
procedure  is  depicted  in  Figure  1.9.  We  induced 
cyclovergence  with  a  conditioning  stimulus  composed 
of  horizontal  lines;  the  lines  were  rotated  in  opposite 
directions  in  the  two  eyes.  We  measured  cyclovergence 
response  using  a  nonius  technique.  The  nonius  figure 
(upper  right  panel)  was  flashed  and  the  observer 
judged  whether  the  lines  were  subjectively  parallel.  We 
validated  the  nonius  technique  by  using  3-D  search 
coils  in  van  den  Berg's  lab  in  Rotterdam.  Observers 
performed  the  nonius  task  while  eye  position  was 
measured.  Nonius  and  objective  measures  agreed 
closely  (Hooge  et  al,  2000). 

In  Experiment  2  (we  don't  describe  Experiment  1 
here),  the  stimulus  used  to  measure  perceived  slant 
was  a  large  random-dot  plane;  the  dots  were  back- 
projected  to  render  nonstereo  slant  signals 
uninformative. Different amounts of vertical-shear
disparity were added to the stimulus. The plane was
flashed and the observer adjusted horizontal-shear
disparity until the plane appeared gaze normal (lower
right panel). The procedure cycled between the
conditioning stimulus (2 sec), nonius figure (100 msec),
conditioning stimulus (2 sec), test stimulus (100 msec),
and so forth until the observer was satisfied with both
settings.  By  using  this  procedure,  we  knew  the  eyes' 


Figure 1.9. Experimental procedure. Black lines represent left
eye's  image  and  gray  lines  right  eye's  image.  Conditioning 
stimulus  (left)  is  presented  to  induce  cyclovergence.  Nonius 
technique  (upper  right)  is  used  to  measure  cyclovergence; 
observers  adjust  orientation  of  upper  line  until  subjectively 
parallel  to  lower.  Gaze-normal  task  (lower  right)  is  used  to 
measure slant percepts. Observers adjust horizontal-shear
disparity until random-dot plane appears gaze normal.

Predictions  for  the  gaze-normal  task  are  represented 
by  the  diagonal  lines  in  Figure  1.10.  Cyclovergence 
response  is  plotted  on  the  abscissa  and  horizontal-shear 
disparity  (at  the  retinae)  on  the  ordinate.  If  no 
compensation  occurred  for  changes  in  cyclovergence, 
gaze-normal  settings  would  be  predicted  by  Equation 
(1.4);  the  data  would  lie  on  the  horizontal  line.  If 
complete  compensation  occurred  based  on  vertical- 
shear  disparity  (Equation  1.6),  the  data  would  lie  on  the 
five  diagonal  lines  (one  for  each  amount  of  added 
vertical  disparity).  If  complete  compensation  occurred 
based  on  eye-position  signals  (Equation  1.5),  data 
would  lie  on  the  central  diagonal  line. 
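
The three sets of predictions can be written down compactly. The
sketch below assumes the sign conventions used earlier (retinal
horizontal shear H_R = H_S - tau, with a gaze-normal plane at
H_S = 0), assumes that cyclovergence shifts retinal vertical shear
by the same amount, and assumes a perfectly accurate extra-retinal
signal. The function and model names are hypothetical labels for
the three models described in the text.

```python
def predicted_gaze_normal_shear(model, tau_deg, added_vertical_shear_deg=0.0):
    """Predicted retinal horizontal shear (deg) at which a plane appears gaze normal."""
    if model == "noncompensation":
        # Slant is read from H_R alone, so the setting stays at zero retinal shear.
        return 0.0
    if model == "extra_retinal":
        # The setting satisfies H_R + tau_hat = 0, with tau_hat assumed equal to tau.
        return -tau_deg
    if model == "vertical_disparity":
        # The setting satisfies H_R - V_R = 0, with V_R = added shear - tau (assumed).
        return added_vertical_shear_deg - tau_deg
    raise ValueError(f"unknown model: {model}")

# Tabulate predictions for a few cyclovergence values and added vertical shears.
for tau in (-2.0, 0.0, 2.0):
    flat = predicted_gaze_normal_shear("noncompensation", tau)
    central = predicted_gaze_normal_shear("extra_retinal", tau)
    family = [predicted_gaze_normal_shear("vertical_disparity", tau, v)
              for v in (-4.0, -2.0, 0.0, 2.0, 4.0)]
    print(tau, flat, central, family)
```

In this sketch the noncompensation model gives a horizontal line,
the extra-retinal model gives a single diagonal, and the
vertical-disparity model gives one diagonal per added vertical
shear, with the zero-shear diagonal coinciding with the
extra-retinal prediction.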

Data from one of the three observers are shown in
Figure 1.10. The data are clearly most
consistent  with  compensation  by  vertical-shear 
disparity  (Eqn  1.6).  Data  from  the  other  two  observers 
were  quite  similar. 




Figure  1.10.  Experiment  2  results  for  observer  ITH.  The 
horizontal-shear  disparity  that  appeared  gaze  normal  is 
plotted  as  a  function  of  cyclovergence.  Horizontal  shear  is  in 
retinal coordinates. Stimulus diameter was 35 deg. Vertical-
shear disparity (in head-centric coordinates) was -4, -2, 0, 2,
or  4  deg;  each  is  represented  by  a  different  data  symbol. 
Vertical  shear  at  the  retinas  was  the  sum  of  the  vertical  shear 
added  to  the  stimulus  plus  the  effect  of  cyclovergence.  If  no 
compensation  for  cyclovergence  occurred,  the  data  would 
lie  on  the  horizontal  line.  If  veridical  compensation  based  on 
use  of  vertical-shear  disparity  occurred  (Equation  1.6),  the 
data  would  lie  on  the  diagonal  lines.  If  veridical 
compensation  based  on  use  of  an  extra-retinal, 
cyclovergence  signal  occurred  (Equation  1.5),  the  data 
would  lie  on  the  central  diagonal  line.  Each  data  point 
represents  one  setting. 

Experiments  3  and  4  were  designed  to  determine 
whether  there  is  any  influence  of  extra-retinal,  torsion 
signals.  In  Experiment  3,  we  reduced  the  diameter  of 
the  random-dot  plane  to  5  deg;  this  made  vertical-shear 
disparity  difficult  to  measure.  In  Experiment  4,  the 
stimulus  was  a  single  smooth  vertical  line;  this  makes 
vertical-shear  disparity  impossible  to  measure  because 
there  are  no  vertically  separated  features.  In  both  cases, 
we  found  a  complete  failure  to  compensate  for  the 
eyes'  cyclovergence.  In  other  words,  when  we  induced 
cyclovergence  changes,  the  observer  saw  different 
slants  in  the  stimulus.  It  appears  then  that 
compensation  for  cyclovergence  is  mediated  only  by 
use  of  vertical  disparities.  We  found  no  evidence  for 
use  of  an  extra-retinal  signal. 

During the past year, we completed two more
manuscripts  on  visual  space  perception. 

One  was  an  investigation  into  the  perception  of  slant 
with  real-world  objects.  In  collaboration  with  Fiona 
James,  Keith  Humphrey  and  Tutis  Vilis  of  the 
University  of  Western  Ontario,  we  examined  how 
people  take  eye  position  into  account  when  asked  to 
judge  the  slant  of  a  surface  in  world  coordinates.  To 
judge  slant  relative  to  the  world,  the  nervous  system 
must  measure  surface  slant  relative  to  the  line  of  sight 
(oculocentric  slant),  eye  position  relative  to  the  head, 
and  head  position  relative  to  the  world  coordinates. 
We  showed  two  things:  1)  people  are  quite  good  at 
judging  object  slant  in  world  coordinates  and  2)  their 
errors  are  the  outcome  of  errors  in  all  three 
measurements.  This  work  is  currently  in  press  in  Vision 
Research  (James,  Whitehead,  Humphrey,  Banks,  &  Vilis, 
2001). 
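
The chain of three measurements can be illustrated with a short Python sketch. It is not the analysis from the manuscript: it assumes that all three rotations are about a common vertical axis (so the angles simply add) and that the three measurement errors are independent, and the function names are hypothetical.

```python
import math

def world_slant(oculocentric_slant_deg, eye_in_head_deg, head_in_world_deg):
    """World-coordinate slant as the sum of the three measured angles.

    Assumes all rotations are about the same vertical axis.
    """
    return oculocentric_slant_deg + eye_in_head_deg + head_in_world_deg

def world_slant_sd(sd_slant_deg, sd_eye_deg, sd_head_deg):
    """SD of the world-slant estimate if the three errors are independent."""
    return math.sqrt(sd_slant_deg**2 + sd_eye_deg**2 + sd_head_deg**2)
```

Under these assumptions, the error in the world-coordinate judgment can never be smaller than the largest of the three component errors.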

The  other  manuscript  reports  an  investigation  into 
the  means  by  which  we  estimate  the  horizontal 
eccentricity  of  an  object  relative  to  the  head.  There  are 
two  quite  different  methods  by  which  the  nervous 
system  could  estimate  the  head-centric  eccentricity  of 
an  object. 

The first method is the obvious one. Measured
from the cyclopean eye, the horizontal eccentricity or
azimuth of an object point is given by the average of
the horizontal rotations of the left and right eyes,
$\theta_L$ and $\theta_R$; this quantity is called the horizontal
version of the eyes, $\gamma$. Thus, an observer can in principle
estimate a fixated object's azimuth, $\alpha$, from $\gamma$. If the
observer is not fixating the object, the azimuth is the
sum of the retinal image eccentricity ($r$, which is the
average of the retinal eccentricities in the two eyes)
and the version: $\alpha = r + \gamma$. Azimuth is, therefore, given
by:

$$\hat{\alpha} = \hat{r} + \hat{\gamma} \qquad (1.7)$$

where the hats signify measurements of the relevant
quantities.

The  second  method  is  less  obvious  and  does  not 
require  the  use  of  eye-position  signals.  If  an  object  is  to 
the  left  of  the  head's  median  plane,  it  is  guaranteed  to 
be  taller  in  the  left  eye  than  in  the  right  eye.  If  it  is  to 
the  right  of  the  median  plane,  it  will  be  taller  in  the 
right  eye.  In  this  manuscript  we  showed  that  one  could 
in  principle  estimate  head-centric  eccentricity  in  the 
following  way. 


$$\alpha \approx \tan^{-1}\left(\frac{\ln \mathrm{VSR}}{\mu}\right) \qquad (1.8)$$

where VSR is the vertical size ratio (a measure of
vertical disparity) and $\mu$ is the eyes' vergence (which
can be obtained from retinal information alone).
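
The two candidate estimators (Eqns 1.7 and 1.8) can be compared in a few lines of Python. This is a minimal sketch, not the code used in the experiments. The function names, the 6.5 cm interocular distance, the sign convention (leftward azimuth positive, VSR taken as left-eye size over right-eye size), and the treatment of the object as a small patch whose vertical angular size is inversely proportional to its distance from each eye are all assumptions made for illustration.

```python
import math

def azimuth_from_eye_position(retinal_ecc_deg, version_deg):
    """Eqn 1.7: azimuth = retinal eccentricity + horizontal version (deg)."""
    return retinal_ecc_deg + version_deg

def azimuth_from_vsr(vsr, vergence_rad):
    """Eqn 1.8: azimuth ~ atan(ln(VSR) / vergence), returned in deg."""
    return math.degrees(math.atan(math.log(vsr) / vergence_rad))

def vsr_and_vergence(azimuth_deg, distance_m, iod_m=0.065):
    """VSR and vergence for a small patch at a head-centric azimuth.

    Azimuth is positive to the left; distance is measured from the
    cyclopean eye. The eyes sit at x = -iod/2 (left) and x = +iod/2.
    """
    a = math.radians(azimuth_deg)
    x = -distance_m * math.sin(a)          # leftward is negative x
    z = distance_m * math.cos(a)
    d_left = math.hypot(x + iod_m / 2, z)
    d_right = math.hypot(x - iod_m / 2, z)
    vsr = d_right / d_left                 # angular size ~ 1 / distance
    # Angle subtended at the patch by the two eyes.
    vergence = math.atan2(z * iod_m, distance_m**2 - iod_m**2 / 4)
    return vsr, vergence
```

For a patch 20 deg to the left at 1 m, `vsr_and_vergence(20.0, 1.0)` gives a VSR slightly above 1, and `azimuth_from_vsr` recovers an azimuth close to 20 deg without any eye-position signal.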


We  asked  which  method  the  visual  system  uses  in 
estimating  the  horizontal  eccentricity  of  an  object.  On 
each  trial,  a  large,  random-dot  surface  was  presented  at 
a  chosen  horizontal  eccentricity.  The  observers  had  to 
turn the eyes by different amounts (different horizontal
versions)  to  fixate  the  center  of  the  surface.  The  eye- 
position-specified  azimuth  of  the  stimulus  was  how  far 
the  eyes  had  to  turn  left  or  right;  the  vertical-disparity- 
specified  azimuth  was  the  pattern  of  vertical  disparities 
presented  to  the  eyes  (taller  in  the  left  eye  when  the 
disparity-specified  azimuth  was  to  the  left).  Observers 
pointed  in  the  perceived  direction  of  the  center  of  the 
stimulus  with  an  invisible  pointer  held  by  the  two 
hands.  The  data  from  one  observer  are  presented  in 
Figure  1.11  (the  other  observers  yielded  very  similar 
data).  The  azimuth  of  the  pointing  response  is  plotted 
as  a  function  of  the  eye-position-specified  azimuth  of 
the  stimulus.  The  different  symbols  represent  data  for 
different  disparity-specified  azimuths.  Clearly,  the 
eye-position-specified  azimuth  was  the  sole 
determinant  of  perceived  direction.  Thus,  the  method 
suggested  by  Eqn.  1.7  seems  to  be  employed  by  the 
nervous  system. 


[Figure 1.11 plot; abscissa: Eye-position Azimuth (deg)]

Figure  1.11.  Response  azimuth  as  a  function  of  eye-position- 
specified  azimuth  for  a  viewing  distance  of  19  cm.  The 
azimuths  of  the  observer's  responses  are  plotted  as  a  function 
of  the  eye-position-specified  azimuth.  Different  disparity- 
specified  azimuths  are  represented  by  different  symbols.  The 
error bars indicate ±1 standard deviation.


The  manuscript  describing  this  work  is  in  press  at 
Vision  Research  (Banks,  Backus,  &  Banks,  2001). 


2.  Heading  Perception 


We  have  also  continued  our  research  on  the 
perception  of  self-motion.  In  the  previous  review 
period, five publications appeared from this project:
Crowell  et  al  (1998),  Ehrlich  et  al  (1998),  Freeman  and 
Banks  (1998),  Freeman  (1999),  and  Freeman  et  al  (2000). 
In  this  last  grant  period,  we  have  completed  the 
theoretical analysis and experiments on another project
that  was  reported  at  ARVO  (Sibigtroth  &  Banks,  2000). 
We  also  completed  construction  of  our  3-axis  rotating 
chair  in  which  we  are  conducting  visual-vestibular 
research  that  is  relevant  to  spatial  disorientation  in 
aviation.  We  are  approximately  half-way  through  the 
first  set  of  experiments  on  visual-vestibular 
interactions.  We  presented  preliminary  data  at  VSS 
(Sibigtroth  &  Banks,  2001). 

As  in  the  previous  section,  we  will  first  review  the 
background  material  for  this  research  project  before 
moving on to the particular experiments and analyses
that were completed. Much of this background section
also  appeared  in  the  previous  progress  report,  so  if  you 
read  it,  you  may  want  to  skip  ahead  to  page  11. 

As  a  person  moves  through  the  environment, 
images  move  across  the  retina,  the  eyes  move  relative 
to  the  head,  the  head  turns  relative  to  the  body,  and  the 
body  translates  and  rotates  through  space.  Despite  this 
complex  of  various  motions,  the  nervous  system 
produces  a  coherent  percept  of  the  person's  motion 
relative  to  environmental  landmarks.  From  this 
percept,  the  human  observer  is  able  to  move  toward 
targets,  avoid  obstacles,  and  guide  complicated 
perceptual-motor  behavior.  We  have  been  examining 
how  the  nervous  system  accomplishes  this.  Our  work 
has  examined  the  analysis  of  visual  signals,  eye- 
velocity  signals,  head-velocity  signals,  and  more. 

We  continued  our  work  on  the  use  of  various 
signals  to  estimate  the  direction  of  self-motion.  The 
problem we examined is the rotation problem, so we
begin  with  a  description  of  that  problem,  followed  by  a 
brief  literature  review,  and  then  by  a  description  of  our 
work  during  the  grant  period. 

As  we  move  through  the  environment,  the  retinal 
image  of  that  environment  changes  in  predictable 
ways.  For  example,  if  we  move  in  a  straight  line  our 
self-motion  produces  a  radial  pattern  of  motion  in  the 
retinal image, like that in Figure 2.1A. The center or
focus of the radial expansion (marked by a cross in Figure
2.1A) corresponds to our direction of motion (Gibson et
al, 1955). Re-creating this pattern of retinal-image
motion  by  viewing  a  film  or  computer  display 
depicting  our  forward  motion  can  cause  a  compelling 
sensation  that  we  are  in  fact  moving  forward  (Howard, 
1982), and under a variety of conditions we can
accurately  estimate  where  we  are  going  in  the 
simulated  scene  (Warren  et  al,  1988;  Royden  et  al, 
1992). 

Figure 2.1. Retinal flow fields for two viewing situations. A)
Forward  translation  without  a  gaze  rotation.  Observer  is 
fixating  in  constant  direction  and  heading  toward  the  cross. 
B)  Forward  translation  while  making  a  gaze  rotation. 
Observer  is  making  a  rightward  eye  movement. 

When  we  smoothly  shift  gaze  direction  by  turning 
the  eye  or  head  (e.g.  to  look  at  a  moving  object  or  a 
stationary  object  to  the  side)  while  still  moving  in  a 
straight  line,  the  pattern  of  retinal-image  motion  is 
more  complex  (Figure  2.1B).  We  can  re-create  this  type 
of  retinal  motion  pattern  by  having  observers  hold  the 
eyes  still  while  viewing  a  display  that  simulates  both 
their  forward  motion  and  an  eye  movement.  In  this 
case,  observers  report  that  they  are  moving  along  a 
curved  trajectory  (as  though  turning  a  car  while 
looking  forward  through  the  windshield)  rather  than 
along  the  depicted  linear  path.  When  they  are  asked  to 
adjust the position of a marker in front of them until it
appears  to  sit  upon  their  future  path,  their  responses 
are  strongly  biased  in  the  direction  of  the  perceived 
path  curvature  (Royden  et  al,  1992,  1994;  Banks  et  al, 
1996;  van  den  Berg,  1996).  On  the  other  hand,  self- 
motion  judgments  are  quite  accurate  when  the  identical 
pattern  of  retinal  image  motion  is  created  by  having 
observers  view  a  display  like  that  in  Figure  2.1A  while 
turning  the  eye  to  pursue  a  target  that  moves  across  it 
(Royden  et  al,  1992,  1994;  Banks  et  al,  1996;  van  den 
Berg,  1996).  Observers  typically  report  that  they  appear 
to  be  moving  on  a  straight  rather  than  a  curved  path 
(Royden,  1994).  In  this  case,  the  observer's  visual 
system has extra-retinal information about the eye
movement,  probably  consisting  mainly  of  an  efference 
copy  of  the  motor  command  to  turn  the  eye  (Howard, 
1982).  The  visual  system  uses  this  information  to 
compensate  for  the  effects  of  the  eye  movement  on  the 
retinal  motion  pattern;  previous  research  using  self- 
motion  judgments  indicates  that  this  compensation  is 
nearly  complete. 
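
The rotation problem can be made concrete with a small Python sketch of retinal flow under combined translation and rotation. It is an illustration rather than the stimulus code used in these experiments: it assumes a pinhole eye with unit focal length, eye-centred coordinates with Z along the line of sight, and a stationary scene. The function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def retinal_flow(point, translation, rotation):
    """Image velocity (u, v) of a stationary scene point.

    point:       (X, Y, Z) in eye-centred coordinates, Z > 0.
    translation: observer velocity (Tx, Ty, Tz).
    rotation:    eye angular velocity (wx, wy, wz) in rad/s.
    """
    X, Y, Z = point
    w_cross_p = cross(rotation, point)
    # Velocity of the point relative to the moving, rotating eye.
    dX, dY, dZ = (-translation[i] - w_cross_p[i] for i in range(3))
    x, y = X / Z, Y / Z                    # perspective projection
    return ((dX - x * dZ) / Z, (dY - y * dZ) / Z)
```

With pure forward translation, the point straight ahead has zero flow: it is the focus of expansion. Adding a rotation about the vertical axis gives that same point a horizontal image velocity equal in magnitude to the rotation rate, so the zero-flow point in the image no longer coincides with the heading.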

We  completed  an  investigation  of  how  perspective 
transformations  affect  humans'  ability  to  estimate  self- 
motion.  The  first  set  of  experiments  was  presented  at 
ARVO  (Sibigtroth  &  Banks,  2000).  The  optic  flow  field 
created by self-motion through a rigid environment is
an important cue to direction of self-motion, but it is
not  the  only  visual  cue.  Consider,  for  example,  the  case 
when  you  walk  by  a  rectilinear  frame  (depicted  in 
Figure  2.2).  If  you  pass  to  the  left  side  of  the  frame,  you 
will  see  the  left  side  grow  more  in  visual  angle  than  the 
right  side.  If  you  pass  to  the  right,  the  opposite  will 
occur.  Can  the  visual  system  take  advantage  of  this 
perspective  information  (assuming  that  the  frame  is 
indeed  rectilinear)  to  estimate  the  heading?  Jeremy 
Beer  at  Brooks  AFB  had  a  similar  insight  a  few  years 
ago and showed that people are sensitive to this
information.  During  the  grant  period,  we  worked 
through  the  mathematics  and  showed  how  this 
perspective  transformation  information  could  be  used 
to  determine  the  direction  of  self-motion.  We  then 
conducted  experiments  (presented  at  ARVO)  that 
showed  that  human  observers  do  indeed  use  this 
information  to  estimate  heading.  This  summer  we  are 
completing  some  control  experiments  before  writing 
the  work  up  for  publication. 
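
The perspective cue described above can be sketched numerically. The Python below is only an illustration of the geometry, not our derivation: the frame dimensions, the straight path parallel to the frame's normal, and the function names are all assumptions.

```python
import math

def edge_visual_angle(edge_x, edge_z, edge_height, eye_x, eye_z):
    """Visual angle (deg) of a vertical edge centred at eye height."""
    d = math.hypot(edge_x - eye_x, edge_z - eye_z)
    return math.degrees(2 * math.atan(edge_height / (2 * d)))

def edge_growth(path_x, z_start, z_end,
                frame_half_width=0.5, frame_height=1.0, frame_z=0.0):
    """Growth in visual angle of the frame's left and right edges.

    The eye moves along the line x = path_x from z_start to z_end,
    toward a frame lying in the plane z = frame_z (units: metres).
    """
    growth = {}
    for name, ex in (("left", -frame_half_width),
                     ("right", frame_half_width)):
        before = edge_visual_angle(ex, frame_z, frame_height, path_x, z_start)
        after = edge_visual_angle(ex, frame_z, frame_height, path_x, z_end)
        growth[name] = after - before
    return growth
```

For a path that passes to the left of the frame (for example `edge_growth(-1.0, -5.0, -1.0)`), the left edge grows more than the right; the asymmetry reverses for a path that passes to the right.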

During  the  grant  period,  we  completed  construction 
of  our  three-axis  rotating  chair.  We  began  to  investigate 
how  stimulation  of  the  otoliths  (the  parts  of  the 
vestibular  apparatus  that  signal  linear  acceleration) 
affects  the  perception  of  heading.  Such  investigations 
are  clearly  relevant  to  understanding  visual-vestibular 
illusions  that  occur  in  aviation  such  as  the  pitch-up 
(somatogravic)  illusion  and  the  bank  illusion  (which 
can  lead  to  the  death  spiral). 

We  presented  subjects  optic  flow  displays 
simulating  forward  translation  and  a  gaze  rotation  (see 
Figure  2.1B).  Normally,  observers  say  they  perceive 
curvilinear self-motion with such displays. We found,
however, that if we rolled observers to simulate correct or
incorrect centrifugal force, we could bias their percepts
of self-motion path.


Figure 2.2. Upper panels: Plan views of two different self-
motion  paths.  The  paths  are  represented  by  the  successive 
positions  of  the  eyes.  The  gaze  directions  are  represented  by 
the  arrows  from  the  eyes.  The  rectilinear  frame  is  shown  in 
the  upper  part  of  both  panels.  Other  panels  show  the  view 
sequences  associated  with  those  two  paths.  Notice  the  change 
in  the  projected  shape  of  the  rectilinear  frame. 

We  also  investigated  the  role  of  otolith  signals  in  the 
estimation  of  self-motion  paths.  As  we  explained 
above,  an  observer  on  a  linear  path  making  a  smooth 
horizontal  eye  movement  to  the  right  creates  optic  flow 
that  is  very  similar  to  an  observer  on  a  curvilinear  path 
with  the  curvature  to  the  right  (Royden,  1994).  As  we 
have  shown  earlier,  observers  tend  to  see  curvilinear 
paths  when  optic  flow  consistent  with  a  linear  path  is 
presented  unless  the  gaze  rotation  is  accompanied  by 
either  an  extra-retinal,  eye-velocity  signal  or  an  extra- 
retinal,  head-velocity  signal  (e.g.,  neck  proprioceptors 
and  semi-circular  canals).  Notice  that  if  an  observer  is 
on a curvilinear path (left side of Figure 2.3), the
centripetal acceleration to the right creates leftward
centrifugal  force  (labeled  "a"  in  the  middle  of  Figure 
2.3).  An  observer  on  a  linear  path  making  a  gaze 
rotation  to  the  right  will  not  experience  this  centrifugal 
force.  If  the  otoliths  sensed  the  centrifugal  force,  they 
could  aid  in  the  disambiguation  of  linear  vs  curvilinear 
self-motion.  We  tested  this  hypothesis  during  the  grant 
review  period. 

We  completed  the  software  for  our  3-axis  rotating 
chair  so  we  can  rotate  the  observer  about  the  pitch,  roll, 
and  yaw  axes  while  he/she  views  motion  sequences  on 
a  projection  screen  that  moves  with  the  chair.  To 
determine  whether  the  otolith  signals  are  used  in  the 
estimation  of  self-motion  paths  we  presented  optic  flow 
sequences  consistent  with  linear  paths  (with  simulated 
gaze  rotation)  or  curvilinear  paths  while  the  observer 
either  sat  upright  or  was  rolled  to  simulate  centrifugal 
force  (right  side  of  Figure  2.3).  We  asked  the  observers 
to  report  on  their  future  perceived  position  relative  to  a 
landmark that appeared in the visual scene at the end of
the  motion  sequence. 

[Figure 2.3 panels, left to right: Observer path; Forces produced; Forces simulated]


Figure  2.3.  Scenarios  involved  in  curved  self-motion  paths. 
Left:  circular  path  curving  to  the  right.  Observer's  gaze 
direction  maintains  constant  relationship  to  the  path,  so  it 
rotates  over  time.  Middle:  forces  created  on  circular  path.  The 
gravitational  force  is  represented  by  g  and  the  outward 
centrifugal force by a. The net force is the vector sum, g + a,
which is a force at angle θ relative to the head. Right: we can
simulate this situation by rolling the observer's head through
angle θ.
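
The roll angle in Figure 2.3 follows from elementary mechanics. The sketch below assumes a circular path traversed at constant speed, so that the centripetal acceleration is the product of speed and rotation rate; the function names and the example values are illustrative and are not the parameters used in the experiment.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def centripetal_acceleration(speed_m_s, rotation_rate_deg_s):
    """a = v * omega for a circular path at constant speed."""
    return speed_m_s * math.radians(rotation_rate_deg_s)

def roll_angle_deg(speed_m_s, rotation_rate_deg_s):
    """Angle of the net force g + a relative to gravity, in deg.

    Rolling the observer through this angle matches the direction,
    though not the magnitude, of the net force on a curved path.
    """
    a = centripetal_acceleration(speed_m_s, rotation_rate_deg_s)
    return math.degrees(math.atan2(a, G))
```

For example, a simulated speed of 2 m/s with a rotation rate of 5 deg/s corresponds to a roll of about 1 deg.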

The  results  are  shown  in  Figure  2.4.  The  panel  on 
the  left  shows  data  when  the  motion  sequence 
simulated  linear  paths  and  the  one  on  the  right  shows 
data  when  the  sequence  simulated  curvilinear  paths. 
The  graphs  plot  the  error  in  the  perceived  path  as  a 
function  of  the  gaze  rotation  rate.  If  performance  were 
veridical,  the  data  would  fall  on  the  horizontal  dashed 
line. 

When  a  linear  path  was  presented  (left  panel), 
observers  reported  more  curved  paths  when  they  were 
rolled  ("otolith  inconsistent";  circular  data  symbols) 
than  when  they  were  upright  ("otolith  consistent"; 
square  symbols).  When  a  curvilinear  path  was 
presented  (right  panel),  observers  again  reported  more 
curved  paths  when  they  were  rolled  ("otolith 
consistent";  circles)  than  when  they  were  not  ("otolith 
inconsistent"; squares). These results show that the
otolith  signal  does  indeed  affect  heading  judgments. 

Figure 2.4. Results from visual-otolith experiment. Both
panels  plot  the  error  in  the  perceived  path  (in  deg)  as  a 
function  of  the  gaze  rotation  rate.  Left  panel:  the  optic  flow 
sequence  simulated  a  linear  path  with  gaze  rotation.  If 
responses  were  veridical,  the  data  would  lie  on  the 
horizontal  dashed  line.  When  the  observer  was  rolled  (otolith 
inconsistent;  circles),  he/she  reported  more  path  curvature 
than when he/she was not rolled (otolith consistent;
squares).  Right  panel:  the  optic  flow  sequence  simulated  a 
circular  path  with  gaze  rotation  due  to  the  path  curvature.  If 
responses  were  veridical,  the  data  would  lie  on  the 
horizontal  dashed  line.  When  the  observer  was  rolled  (otolith 
consistent;  circles),  he/she  reported  more  path  curvature 
than  when  he/she  was  not  rolled  (otolith  inconsistent; 
squares). 

3.  Visual-Haptic  Integration 

During  the  grant  period,  we  purchased  equipment 
and  constructed  an  apparatus  for  studying  visual- 
haptic  integration.  The  major  equipment  purchase  was 
for  two  PHANToM  force-feedback  devices  that  allow 
one  to  simulate  haptic  stimuli;  these  monies  were 
provided  by  AFOSR  in  an  equipment  supplement 
grant. We were also given an OnyxII graphics
workstation  from  Silicon  Graphics  to  serve  as  host  for 
the  PHANToMs. 

The  experimental  setup  is  schematized  in  Figure  3.1. 
The  visual  display  is  viewed  in  a  mirror  placed  above 
the  hand.  The  index  finger  and  thumb  of  the  right  hand 
are  placed  in  separate  haptic  feedback  devices 
(depicted beneath the mirror). The PHANToM
feedback devices provide force to the finger and
thumb and thereby simulate a virtual object or
surface. 

Figure 3.1. The experimental setup for the visual-haptic
integration  experiments.  The  visual  stimulus  is  presented  in 
the  display  screen  above  and  viewed  in  the  mirror  placed 
above  the  hand.  The  index  finger  and  thumb  of  the  right 
hand are placed in PHANToM haptic feedback devices. The
observer  touches  and  grasps  virtual  objects  in  the  workspace. 
Haptic  feedback  creates  the  sensation  of  touching  a  real 
object  or  surface. 

During  the  grant  period  before  this  one,  we 
completed  an  experiment  on  the  use  of  haptic 
information to set the weights given to different visual
cues.  This  work  appeared  in  Nature  Neuroscience  (Ernst, 
Banks,  &  Buelthoff,  2000).  It  was  conducted  in 
Germany.  During  this  last  grant  period,  we  completed 
an  experiment  on  visual-haptic  integration  and  the 
manuscript  describing  this  work  was  recently  accepted 
in  Nature  (Ernst  &  Banks,  2001).  We  describe  this 
experiment  here. 

When  a  person  looks  at  an  object  while  exploring  it 
with  the  hand,  vision  and  touch  both  provide  useful 
information  for  estimating  the  object's  properties. 
Frequently, vision dominates the integrated visual-haptic
percept, such as when judging size, shape, or
position (Rock & Victor, 1964), but in some
circumstances, the percept is clearly affected by haptics
(Power, 1980). If the human observer uses vision and
haptics  to  estimate  an  environmental  property  (e.g.,  an 
object's  size),  it  would  be  sensible  to  do  it  in  a  way  that 
minimizes error in the final estimate. This general
principle, minimizing variance in the final estimate,
can be realized by using maximum-likelihood
estimation (MLE; Landy et al, 1995; Ghahramani et al,
1997) to combine the inputs.

A sensory system's estimate of an environmental
property can be represented by:

$$\hat{S}_i = f_i(S) \qquad (3.1)$$

where $S$ is the physical property being estimated and $f$
the operation by which the nervous system does the
estimation. The subscripts refer to the modality ($i$
could also refer to different cues within a modality).
Each sensor's estimate, $\hat{S}_i$, is corrupted by noise. If the
noises are independent and Gaussian with variance
$\sigma_i^2$ and the Bayesian prior is uniform, MLE of the
environmental property is given by:

$$\hat{S} = \sum_i w_i \hat{S}_i \quad \text{with:} \quad w_i = \frac{1/\sigma_i^2}{\sum_j 1/\sigma_j^2} \qquad (3.2)$$

Thus, the MLE rule states that the optimal means of
estimation is to add the sensor estimates weighted by
their normalized reciprocal variance (Landy et al,
1995). If the MLE rule is used to combine visual and
haptic estimates, $\hat{S}_V$ and $\hat{S}_H$, the variance of the final
(visual-haptic) estimate, $\hat{S}$, is:

$$\sigma_{VH}^2 = \frac{\sigma_V^2 \sigma_H^2}{\sigma_V^2 + \sigma_H^2} \qquad (3.3)$$

Thus, the final estimate has lower variance than either
the visual or haptic estimator.
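
The variance reduction promised by the MLE rule can be checked with a short simulation. This is a sketch under the assumptions stated in the text (independent Gaussian noise, unbiased estimators); the function names, noise levels, and trial count are arbitrary choices for illustration.

```python
import random
import statistics

def mle_weights(sigma_v, sigma_h):
    """Normalized reciprocal-variance weights (w_v, w_h), as in Eqn 3.2."""
    r_v, r_h = 1.0 / sigma_v**2, 1.0 / sigma_h**2
    return r_v / (r_v + r_h), r_h / (r_v + r_h)

def combined_variance(sigma_v, sigma_h):
    """Predicted variance of the combined estimate, as in Eqn 3.3."""
    return (sigma_v**2 * sigma_h**2) / (sigma_v**2 + sigma_h**2)

def simulate_combined_estimates(true_size, sigma_v, sigma_h, n_trials, seed=0):
    """Draw noisy visual and haptic estimates and combine them by MLE."""
    rng = random.Random(seed)
    w_v, w_h = mle_weights(sigma_v, sigma_h)
    return [w_v * rng.gauss(true_size, sigma_v)
            + w_h * rng.gauss(true_size, sigma_h)
            for _ in range(n_trials)]
```

With a visual SD of 1 and a haptic SD of 2 (arbitrary units), the weights are 0.8 and 0.2, the predicted combined variance is 0.8, and `statistics.variance` of the simulated estimates comes out close to that value, below both single-cue variances.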

Here  we  examine  visual-haptic  integration 
quantitatively  to  determine  whether  human 
performance  is  optimal.  Observers  looked  at  and/or 
felt  a  raised  ridge  and  judged  its  height  (vertical 
extent).  To  work  out  the  predictions  of  the  MLE  rule, 
we  first  determined  the  variances  of  the  visual  and 
haptic height estimates (within-modality) by
conducting  discrimination  experiments.  In  the  haptic- 
alone  experiment,  observers  indicated  which  of  two 
sequentially  presented  ridges  was  taller  from  haptic 
information  alone;  in  the  visual-alone  experiment, 
they  did  the  same  from  vision  alone.  There  were  four 
conditions  in  the  visual  experiment  differing  in  the 
amount  of  noise  in  the  stimulus.  By  adding  noise  we 
made  the  visually  specified  height  less  reliable. 

[Figure 3.2A plot; panel title: Within-Modality Discrimination]


Figure  3.2A.  Within-modality  discrimination  data.  The 
proportion  of  trials  in  which  the  comparison  stimulus  was 
perceived  as  taller  than  the  standard  stimulus  is  plotted  as  a 
function  of  the  comparison's  height.  Data  have  been 
averaged  across  the  four  observers.  The  standard's  height 
was  always  55mm.  The  haptic  discrimination  data  are 
represented  by  the  x's  and  the  best-fitting  cumulative 
Gaussian  by  the  dashed  curve.  The  visual  discrimination 
data  are  represented  by  the  four  open  symbols  and  solid 
curves.  Each  corresponds  to  a  different  level  of  visual  noise. 

Fig.  3.2a  shows  the  visual-alone  and  haptic-alone 
discrimination  data.  The  proportion  of  trials  in  which 
the  observer  indicated  that  the  comparison  stimulus 
(variable  height)  appeared  taller  than  the  standard 
stimulus  (fixed  height  of  55mm)  is  plotted  as  a 
function  of  the  comparison's  height.  The  dashed  line 
and  symbols  represent  the  haptic  discrimination  data 
and  the  solid  curves  with  open  symbols  represent  the 
visual  data  for  the  four  levels  of  noise.  These 
psychometric functions were well fit by cumulative
Gaussians. Discrimination threshold is defined as the
difference between the PSE and height of the
comparison stimulus when it is judged taller than the
standard 84% of the time. The 84% points correspond
to $\sqrt{2}$ times the standard deviation of the underlying
estimator. The haptic discrimination threshold was
approximately  0.085  times  the  average  ridge  height 
(which  was  55mm).  As  the  noise  increased  from  0%  to 
200%,  the  visual  discrimination  thresholds  increased 
from  0.04  to  0.2  times  the  average  height.  Thus,  when 
the  visual  noise  was  0%,  the  visual  discrimination 
threshold  was  roughly  half  the  haptic  threshold;  when 
the  visual  noise  was  200%,  the  visual  threshold  was 
more  than  double  the  haptic  threshold. 
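
The relationship between the 84% point and the estimator's standard deviation can be verified directly. The sketch below assumes that the standard and comparison are each corrupted by independent Gaussian noise with the same SD, so that their difference has an SD of sqrt(2) times that value; the function names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard cumulative Gaussian."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_comparison_taller(comparison_mm, pse_mm, sigma_mm):
    """Probability of judging the comparison taller than the standard.

    Each interval carries independent noise of SD sigma_mm, so the
    difference between them has SD sqrt(2) * sigma_mm.
    """
    z = (comparison_mm - pse_mm) / (math.sqrt(2.0) * sigma_mm)
    return normal_cdf(z)

def threshold_from_sigma(sigma_mm):
    """84% discrimination threshold for an estimator with SD sigma_mm."""
    return math.sqrt(2.0) * sigma_mm

def sigma_from_threshold(threshold_mm):
    """Inverse of threshold_from_sigma."""
    return threshold_mm / math.sqrt(2.0)
```

As a usage example, the haptic threshold of roughly 0.085 x 55mm, about 4.7mm, corresponds under these assumptions to an estimator SD of about 3.3mm.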

In  the  visual-haptic  experiment,  observers 
simultaneously  looked  at  and  felt  two  raised  ridges 
that  were  presented  sequentially.  In  one  presentation, 
the  visually  and  haptically  specified  heights  were 
equal (comparison stimulus); in the other presentation,
they differed (standard stimulus). The difference in the
specified heights was d = ±6, ±3, or 0mm (the average of
$S_H$ and $S_V$ was 55mm). For each d in the standard
(randomly  presented),  the  height  of  the  comparison 
stimulus  was  varied  (47-63mm)  randomly  from  trial  to 
trial.  On  each  trial,  the  observer  indicated  which 
stimulus  seemed  taller. 


[Figure 3.2B plot; panel title: Visual-Haptic Discrimination]


Figure  3.2.B.  Visual-haptic  discrimination  data.  The 
proportion  of  trials  in  which  the  comparison  stimulus  was 
perceived  as  taller  than  the  standard  stimulus  is  plotted  as  a 
function  of  the  comparison's  height.  The  standard's  average 
height  was  always  55mm,  but  the  difference  between  the 
visually  and  haptically  specified  heights  varied  from  -6  to 
6mm.  To  plot  the  data  on  one  set  of  coordinates,  we  shifted 
the  psychometric  functions  laterally  by  w^All.  The  four  sets 
of  symbols  correspond  to  different  levels  of  visual  noise. 


Fig.  3.2b  shows  the  proportion  of  trials  in  which 
the  comparison  stimulus  was  chosen  as  taller  as  a 
function  of  the  comparison's  height.  From  those 
psychometric functions, we obtained the point of
subjective equality (PSE), the comparison height
appearing equal to the standard height, and the
just-discriminable change in height (threshold).

Using  the  within-modality  data,  we  can  now 
predict  what  an  observer  using  MLE  would  do  when 
presented  visual  and  haptic  information 
simultaneously  and  then  compare  those  predictions  to 
the  performance  in  the  visual-haptic  experiment. 

We first describe the analysis of the PSE data and
predictions for the weights. From Eqn. (3.2) and the
relationship between threshold and estimator
variance:

$$\frac{w_V}{w_H} = \frac{\sigma_H^2}{\sigma_V^2} = \frac{T_H^2}{T_V^2}$$

where $T_H$ and $T_V$ are the haptic and visual thresholds
(84% points in Fig. 3.2a). Incorporating the
normalization assumption ($w_V + w_H = 1$), the
predicted weights for optimal integration are:

$$w_V = \frac{T_H^2}{T_V^2 + T_H^2}, \qquad w_H = \frac{T_V^2}{T_V^2 + T_H^2} \qquad (3.4)$$

Figure 3.2.C. Observed and predicted weights and PSEs in
visual-haptic judgments. The abscissa is the amount of
visual noise. The left ordinate is the visual weight. The right
ordinate shows the PSEs in relation to $S_V$ and $S_H$. The purple
symbols represent the observed visual weights based on the
observers' PSEs in the visual-haptic experiment (Fig 3.2b; Eqn
3.5). The shaded area represents the weights expected from
the within-modality discrimination data (Fig 3.2a; Eqn 3.4).

The  predicted  visual  weights  are  represented  by 
the  curve  and  shaded  surround  in  Fig.  3.2c.  The 
predicted  weights  vary  significantly  with  the  amount 
of  visual  noise  in  the  stimulus:  the  visual  weights  are 
higher  when  the  noise  level  is  low  and  lower  when  the 
noise level is high. Assuming that the visual and
haptic estimators are on average unbiased ($\hat{S}_V = S_V$
and $\hat{S}_H = S_H$), the weights can be derived
experimentally:

$$w_V = \frac{\mathrm{PSE} - S_H}{S_V - S_H} \qquad (3.5)$$

where PSE is the height of the comparison stimulus
that matched the apparent height of the standard. The
visually and haptically specified heights in the
standard, $S_V$ and $S_H$, are indicated on the right
ordinate. Fig. 3.2c shows that as the noise level was
increased, the visual weight decreased, and the PSE
shifted from $S_V$ toward $S_H$. Because the noise level
varied  randomly  from  trial  to  trial,  the  weights  must 
have  been  set  within  the  1-sec  stimulus  presentation. 
In  the  discussion  we  suggest  a  mechanism  for  such 
dynamic  weight  adjustment.  To  summarize,  the 
predicted and observed PSEs are quite similar,
suggesting that humans do combine visual and haptic
information in a fashion similar to MLE integration.
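
Eqn 3.5 amounts to reading off where the PSE falls between the two conflicting heights. A minimal sketch follows; the function name and the example PSE are hypothetical, not measured values.

```python
def visual_weight_from_pse(pse_mm, s_v_mm, s_h_mm):
    """Empirical visual weight from the PSE of a cue-conflict standard."""
    if s_v_mm == s_h_mm:
        raise ValueError("weight is undefined when the two cues agree")
    return (pse_mm - s_h_mm) / (s_v_mm - s_h_mm)

# Hypothetical example: a 6mm conflict around the 55mm average,
# with an assumed PSE of 56.5mm.
example_weight = visual_weight_from_pse(56.5, 58.0, 52.0)
```

In this made-up case the weight is 0.75: the PSE lies three quarters of the way from the haptically specified height to the visually specified height.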

We  now  turn  to  the  analysis  of  the  visual-haptic 
discrimination  thresholds.  According  to  the  MLE  rule, 
the  combined  estimates  should  have  lower  variance, 
and  therefore  lower  discrimination  thresholds,  than 
either  the  visual  or  haptic  estimate  alone  (Eqn.  3.3).  To 
derive  predictions  for  the  visual-haptic  discrimination 
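thresholds, we rewrite Eqn. 3.3:

$$T_{VH}^2 = \frac{T_V^2 \, T_H^2}{T_V^2 + T_H^2} \qquad (3.6)$$

where $T_V$, $T_H$, and $T_{VH}$ are the visual-alone, haptic-alone, and visual-haptic discrimination thresholds.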

Figure 3.2.D. Within-modality and visual-haptic
discrimination  thresholds.  The  just-noticeable  difference  in 
height  is  plotted  as  a  function  of  the  visual  noise.  The 
threshold  values  are  taken  from  the  psychometric  functions 
of  A  and  B  above;  they  correspond  with  the  difference 
between  the  PSE  and  the  comparison  height  that  was  chosen 
on  84%  of  the  trials  as  taller  than  the  standard  height.  The 
dashed  horizontal  line  represents  the  haptic-alone 
threshold.  The  open  symbols  represent  the  vision-alone 
thresholds. The filled symbols represent the visual-haptic thresholds and the gray shaded area the predicted visual-haptic thresholds (Eqn 3.6).

Fig.  3.2d  shows  the  predicted  and  observed 
thresholds.  The  unfilled  symbols  represent  the  visual- 
alone  thresholds  and  the  dashed  line  represents  the 
haptic-alone  threshold.  The  shaded  area  represents  the 
predicted  visual-haptic  thresholds;  they  are  always 
lower  than  the  visual-alone  and  haptic-alone 
thresholds  at  the  corresponding  noise  level.  The  filled 
purple  points  represent  the  observed  visual-haptic 
discrimination  thresholds;  as  noise  level  increased,  the 
just-noticeable  difference  in  height  became  greater. 
Most  importantly,  the  predicted  and  observed  visual- 
haptic  discrimination  thresholds  are  quite  similar.  As 
with  the  PSE  data,  this  suggests  that  human  observers 
combine visual and haptic information in a fashion
similar  to  MLE  integration. 
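
The threshold prediction can be sketched in a few lines of Python. The sketch assumes that Eqn 3.6 has the standard MLE form, $T_{VH}^2 = T_V^2 T_H^2 / (T_V^2 + T_H^2)$; the function name and the threshold values are hypothetical and serve only to illustrate that the predicted combined threshold falls below both single-modality thresholds.

```python
import math


def predicted_vh_threshold(t_v, t_h):
    """Predicted visual-haptic threshold from the two within-modality thresholds."""
    return math.sqrt((t_v ** 2 * t_h ** 2) / (t_v ** 2 + t_h ** 2))


t_h = 4.0  # hypothetical haptic-alone threshold (mm)
for t_v in (2.0, 4.0, 8.0):  # hypothetical visual-alone thresholds (mm)
    t_vh = predicted_vh_threshold(t_v, t_h)
    # The combined threshold is always below the better single-cue threshold.
    assert t_vh < min(t_v, t_h)
    print(f"T_V = {t_v:.1f}, T_H = {t_h:.1f}, predicted T_VH = {t_vh:.3f}")
```

The benefit of combination is largest when the two thresholds are equal (a reduction by a factor of $\sqrt{2}$) and shrinks as one cue becomes much less reliable than the other.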

In  summary,  we  found  that  height  judgments 
were  remarkably  similar  to  those  predicted  by  the 
MLE  integrator.  Thus,  the  nervous  system  seems  to 
combine  conflicting  visual  and  haptic  information  in  a 
fashion  similar  to  the  MLE  rule:  visual  and  haptic 
estimates  are  weighted  according  to  their  reciprocal 
variances  (Eqn.  3.2).  Naturally,  it  is  important  to 
determine  whether  this  rule  characterizes  the 
estimation  of  other  stimulus  properties  such  as  shape, 
depth,  localization,  roughness,  and  compliance. 

The  relative  contributions  of  vision  and  haptics  to 
perceiving  such  object  properties  have  previously 
been studied. For example, Rock and Victor (1964) had
subjects  grasp  a  square  while  looking  at  it  through  a 
distorting  lens  that  made  it  appear  rectangular.  The 
shape  of  the  unified  percept  was  determined  almost 
completely  by  vision,  so  the  phenomenon  has  been 
called  "visual  capture".  Numerous  studies  have 
replicated  visual  capture  in  shape  and  size  perception, 
depth  perception,  and  localization.  However,  visual 
capture  does  not  occur  in  the  perception  of  surface 
roughness;  rather  perceived  roughness  is  affected 
nearly  equally  by  haptics  and  vision.  Does  a  dynamic 
cue-combination  rule,  like  the  one  described  here, 
determine  the  degree  to  which  vision  or  haptics 
dominates?  The  statistically  optimal  means  of 
combining  visual  and  haptic  information — the  MLE 
rule — predicts  that  "visual  capture"  should  occur 
whenever  the  visual  estimate  of  a  property  has  much 
less  variance  than  the  haptic  estimate.  "Haptic 
capture"  should  be  observed  when  the  reverse  occurs. 
We  observed  behavior  like  visual  capture  when  the 
visual  stimulus  was  noise-free  and  behavior  similar  to 
haptic  capture  when  the  visual  stimulus  was  quite 
noisy  (Fig.  3.2c). 
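
To illustrate how a single reliability-weighted rule can produce both capture-like extremes, here is a small Monte Carlo sketch in Python. It assumes independent Gaussian noise on each cue and reciprocal-variance weights; the function name, noise levels, and stimulus heights are hypothetical and are not the values used in the experiment.

```python
import random
import statistics


def simulate_combined(s_v, s_h, sigma_v, sigma_h, n=20000, seed=0):
    """Simulate weighted combination of noisy visual and haptic estimates.

    Returns (visual weight, mean combined estimate, SD of combined estimate).
    """
    rng = random.Random(seed)
    r_v = 1.0 / sigma_v ** 2
    r_h = 1.0 / sigma_h ** 2
    w_v = r_v / (r_v + r_h)  # reciprocal-variance weight for vision
    estimates = []
    for _ in range(n):
        v = rng.gauss(s_v, sigma_v)  # noisy visual estimate
        h = rng.gauss(s_h, sigma_h)  # noisy haptic estimate
        estimates.append(w_v * v + (1.0 - w_v) * h)
    return w_v, statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical conflict stimulus: vision specifies 58 mm, haptics 52 mm.
for sigma_v in (0.5, 2.0, 8.0):  # increasing visual noise
    w_v, mean_est, sd_est = simulate_combined(58.0, 52.0, sigma_v, sigma_h=2.0)
    print(f"sigma_V = {sigma_v}: w_V = {w_v:.2f}, mean = {mean_est:.2f}, SD = {sd_est:.2f}")
```

With low visual noise the mean combined estimate sits near the visually specified height (behavior like visual capture); with high visual noise it sits near the haptically specified height (behavior like haptic capture).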

4.  Software  Development 

We have spent a great deal of effort over the previous grant periods developing software for psychophysical experimentation. We briefly describe the developments that continued during the grant period. These include specialized computer graphics programs, optimized rendering engines, and tools needed to generate displays with specific spatial and temporal properties: stereo 3D, texture mapping, high-frame-rate animation, and real-time digital image manipulation. We have also developed a suite of external device control routines, sensor and actuator interfaces, drivers, control algorithms, feedback loop systems (human motor through computer sensory channels), and low-level video synchronization tools necessary for real-time psychophysics experiments. Almost all the tools we are developing are in the form of MATLAB shared libraries, external C or assembly programs interfaced to and called from MATLAB. This scheme allows us to tap into the powerful high-level programming and analysis features of MATLAB while we implement experiments that require our low-level tools for real-time
operations. All of our software tools are made available to the public through the Bankslab web page (http://john.berkeley.edu/software.html). They are currently used by many labs around the world.

BitmapTools 

BitmapTools is an external MATLAB plugin for generating high-frame-rate animations (up to the highest refresh rate possible on the graphics card and monitor). It allows for design and display of both static bitmaps and bitmap movies on Macintosh and Windows NT platforms. BitmapTools is designed around one important premise: to maximize the blitting (RAM to video memory transfer) rate. On the Macintosh, BitmapTools takes advantage of the PowerPC processor's pipelining architecture through assembly-level tuning. On the PC (NT), high-bandwidth blitting is achieved through hardware-accelerated calls (DirectDraw). Almost all modern graphics cards contain the necessary hardware for BitmapTools. The issue with movie players in general is their unreliable animation frame rate. In BitmapTools, real-time frame rate is guaranteed. Under normal operation (on NT, with no major background processes), a 1024x768 movie can play at 120 Hz without missing frames. If a frame is missed for some reason, the exact location of the frame(s) in the sequence is reported.
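
As an illustration of what reporting the location of missed frames can look like, the following Python sketch scans a list of frame presentation times for gaps longer than one refresh period. This is not the BitmapTools implementation, which is not described here; the function name, the timestamp format (seconds), and the example data are assumptions made for the sketch.

```python
def find_missed_frames(timestamps, refresh_hz=120.0):
    """Locate dropped frames in a sequence of frame presentation times.

    Returns a list of (index, n_missed) pairs, where index is the position
    of the frame shown right after a gap and n_missed is the number of
    refresh intervals that passed without a new frame.
    """
    period = 1.0 / refresh_hz
    missed = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        n_periods = round(gap / period)  # refresh intervals spanned by the gap
        if n_periods > 1:
            missed.append((i, n_periods - 1))
    return missed


# Hypothetical timestamps at 120 Hz with one frame dropped before index 3
# and two frames dropped before index 5.
times = [k / 120.0 for k in (0, 1, 2, 4, 5, 8)]
print(find_missed_frames(times))
```

Rounding the gap to the nearest whole number of refresh periods keeps small timing jitter from being reported as a dropped frame.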

OpenGLTools 

OpenGLTools  is  a  MATLAB  external  shared  library 
(compiled  mex  file)  that  incorporates  interactive  2D/3D 
graphics  functionality  into  MATLAB.  The  main 
objective  is  to  bridge  MATLAB's  high-level 
programming environment with the low-level OpenGL
graphics  engine.  This  is  useful  because  MATLAB's  data 
types  and  syntax  are  most  natural  for  creation  of  basic 
3D  constructs,  as  well  as  hierarchical  development  and 
manipulation  of  the  more  complex  graphics  objects. 
OpenGLTools  is  augmented  by  a  rich  collection  of 
operators  and  functions  (toolboxes)  embedded  in 
MATLAB.  It  is  designed  as  a  research  tool  for  vision 
scientists  to  create  interactive  visual  stimuli  with 
precise  control  over  spatial,  luminance  and  temporal 
properties.  Some  of  the  advanced  features  include 
stereo  (anaglyph  and  LCD  shutter  glasses),  texture 
mapping,  lighting,  buffer  manipulations,  image 
processing  filters,  and  re-programmable  interactive 
mouse bindings. OpenGLTools is available on MacOS,
Windows,  and  Unix  (IRIX),  although  Windows  (NT)  is 
the  best  supported  platform. 

FlightTools 

FlightTools is a flight simulation construction plugin for MATLAB. Like OpenGLTools, it takes advantage of hardware-accelerated OpenGLTools calls. The user can define scene elements in the form of MATLAB matrices and lists, and specify a flight path and camera gaze as lists. Real-time animation of flight allows interactive control of flight parameters such as pitch, yaw, and roll, as well as other parameters used in the construction of specific simulation functions.

Adaptation to three-dimensional distortions in

Vilis, T. (2001). Accurate slant judgements based on
extra-retinal eye position. Vision Research, in press.

19.  Hooge,  I.T.C.,  Banks,  M.S.,  &  van  den  Berg,  A.V. 
(2001).  Subjective  and  objective  measures  of 
cyclovergence.  Vision  Research,  in  preparation. 

20.  Banks,  M.S.  Sibigtroth,  M.,  &  Backus,  B.T.  (2001). 
Perception  of  surface  curvature  from  stereopsis. 
Vision  Research,  in  preparation. 

21.  Sibigtroth,  M.  &  Banks,  M.S.  (2001).  The  influence 
of  otolith  signals  on  heading  judgments.  Vision 
Research,  in  preparation. 

(2001). The influence of perspective signals on
heading perception. Vision Research, in

6. Service for Air Force

During  the  grant  period,  the  PI  was  asked  to  do  a 
few  things  that  might  potentially  benefit  the  Air  Force. 

In  1998,  he  traveled  to  Williams  AFB  in  Arizona  in 
order  to  meet  with  Byron  Pierce  and  George  Geri. 
During  that  trip,  he  consulted  with  Drs.  Pierce  and  Geri 
on  their  ongoing  research  and  discussed  possible 
collaborations between the Berkeley and Williams labs.
This  led  to  an  equipment  loan  in  which  Williams  sent 
us  an  SGI  Crimson  workstation  and  a  Sony  CRT 
projector. 

In  2000,  the  PI  was  asked  to  join  a  team  that  would 
put  together  a  research  plan  for  the  Spatial 
Disorientation Program for the Air Force Research
Labs.  This  work  consisted  of  reviewing  the  previous 
research  plans,  evaluating  a  plan  written  by 
investigators  at  Wright-Patterson  AFB,  and  then 
traveling  to  Brooks  AFB  for  a  two-day  meeting  chaired 
by  Bill  Ercoline.  The  outcome  was  a  5-year  research 
plan  that  is  currently  being  evaluated  by  the  Air  Force. 

In  2000,  the  PI  participated  in  the  Civic  Outreach 
Program  for  two  days.  He  traveled  to  Moffett  Airfield, 
Edwards AFB, Cheyenne Mountain, and Peterson AFB
and  participated  in  briefings,  tours,  and  discussions 
with  Air  Force  personnel. 

In  2001,  the  PI  traveled  to  Sweden  to  participate  in  a 
workshop  with  SAAB  Military  Aircraft  Division.  The 
workshop  concerned  visual  problems  encountered  by 
aviators  and  possible  solutions  to  the  problems. 

7. Significance of Research Program for Air Force

During  this  grant  period,  we  examined  visual 
navigation  and  space  perception  in  humans.  We  believe 
that  our  research  is  highly  relevant  to  the  military 
aviation  mission.  The  main  area  of  Air  Force  need  that 
is  addressed  by  our  research  is  spatial  disorientation 
(SD)  and  the  use  of  synthetic  or  enhanced  visual 
display  devices  such  as  head-mounted  displays 
(HMDs), night-vision goggles (NVGs), the advanced
aircraft  control  station  (ACS),  and  more. 

SD  remains  a  major  safety  problem  in  flight  and  SD 
is  likely  to  become  an  even  more  serious  problem  as  the 
next  wave  of  aircraft  (e.g.,  agile  flight)  is  developed  and 
put  into  flight.  Our  work  on  heading  perception  is 
aimed at determining the complex of visual and non-visual signals that are used to estimate the direction of self-motion and one's orientation with respect to
gravity.  Specifically,  we  are  working  on  determining 
how  much  weight  is  given  to  various  signals  (e.g.,  optic 
flow  vs  vestibular)  and  how  those  weights  depend  on 
the  viewing  situation  (e.g.,  weight  given  to  vestibular 
increases  as  the  optic  flow  information  is  degraded). 
With a better understanding of how the human nervous system computes and weights these various sources of information, we will be able to provide the Air Force material relevant to pilot training, cockpit design, and the configuration of synthetic visual
displays.  Let  us  give  one  specific  example.  In  our  work 
on  the  somatogravic  ("pitch  up")  illusion,  we  are  trying 
to  determine  what  visual  cues  must  be  present  in  order 
to  override  the  vestibular-based  illusion  of  upward 
pitch.  Once  we  know  what  the  critical  visual  cues  are, 
we can recommend the design of an artificial cockpit display (e.g., an artificial horizon) that would minimize the illusion.

Our  work  on  space  perception,  primarily  slant  and 
curvature  perception  and  visual-haptic  integration,  is 
also  quite  relevant  to  the  military  aviation  mission.  In 
the  next  generation  of  military  aircraft,  we  will  see 
greater  and  greater  reliance  on  synthetic  visual 
displays.  Indeed,  if  the  closed  cockpit  (all  virtual) 
aircraft  is  brought  on  line,  all  of  the  visual  information 
provided  to  the  pilot  will  be  synthetic.  We  have  found 
that  perceived  depth  is  based  on  the  integration  of 
numerous  visual  (e.g.,  disparity  and  texture  gradient) 
and  non-visual  (e.g.,  eye  muscle  signals)  cues.  The  final 
percept  is  the  result  of  a  weighted  combination  of  those 
various cues. An understanding of how those cues are
calculated  and  weighted  in  the  nervous  system  is 
critical  to  the  design  of  a  synthetic  visual  display.  For 
example,  we  know  from  our  work  that  cues  provided 
by  the  CRT  itself  (e.g.,  pixelization,  focus  cues)  cause 
perceptual  depth  compression.  Such  compression 
would be highly undesirable in an all-virtual cockpit
and  so  the  design  of  the  visual  display  will  either  have 
to  reduce  the  influence  of  such  competing  cues  or 
figure  out  how  to  override  them. 

Finally,  our  software  development  might  also  be 
quite  useful  to  the  Air  Force.  At  a  meeting  at  Brooks 
AFB  in  March,  2000  (chaired  by  Bill  Ercoline),  a 
potential  business  plan  was  formed  for  the  next  5  years. 
One idea presented in this plan was to generate web-based instructional aids for teaching spatial disorientation to future pilots. One of our software developments, FlightTools, would allow us to
recreate  flight  scenarios  that  could  be  played  on  the 
internet  for  use  in  the  classroom.  Those  scenarios  could 
be  seen  from  the  perspective  of  the  pilot  or  from  an 
outside  perspective  (chosen  by  the  student).  The 
scenarios  can  be  produced  on  standard  PCs  with  off- 
the-shelf  video  cards.  The  Banks  Lab  is  committed 
through  its  relationship  with  the  Air  Force  to  produce 
material  like  this  whenever  it  might  be  needed. 